Consistency of Cross Validation for Comparing Regression Procedures
Abstract
Theoretical developments on cross validation (CV) have mainly focused on selecting one among a list of finite-dimensional models (e.g., subset or order selection in linear regression) or selecting a smoothing parameter (e.g., bandwidth for kernel smoothing). However, little is known about consistency of cross validation when applied to compare between parametric and nonparametric methods or within nonparametric methods. We show that under some conditions, with an appropriate choice of data splitting ratio, cross validation is consistent in the sense of selecting the better procedure with probability approaching 1. Our results reveal interesting behavior of cross validation. When comparing two models (procedures) converging at the same nonparametric rate, in contrast to the parametric case, it turns out that the proportion of data used for evaluation in CV does not need to be dominating in size. Furthermore, it can even be of a smaller order than the proportion for estimation while not affecting the consistency property.
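To make the role of the data splitting ratio concrete, here is a minimal sketch of a single-split comparison between one parametric and one nonparametric regression procedure. It is not the paper's procedure or proof setting: the function names (`fit_linear`, `fit_kernel`, `cv_select`), the Gaussian-kernel smoother, the fixed bandwidth, and the simulated data are all assumptions chosen for illustration.

```python
import numpy as np

def fit_linear(x_train, y_train):
    # Parametric candidate (illustrative): least-squares straight line.
    slope, intercept = np.polyfit(x_train, y_train, deg=1)
    return lambda x: slope * np.asarray(x) + intercept

def fit_kernel(x_train, y_train, bandwidth=0.1):
    # Nonparametric candidate (illustrative): Nadaraya-Watson smoother
    # with a Gaussian kernel and a fixed, hypothetical bandwidth.
    def predict(x):
        x = np.atleast_1d(x)
        w = np.exp(-0.5 * ((x[:, None] - x_train[None, :]) / bandwidth) ** 2)
        return (w @ y_train) / np.maximum(w.sum(axis=1), 1e-12)
    return predict

def cv_select(x, y, procedures, n_eval, rng):
    # Split once: n_eval points for evaluation, the rest for estimation.
    perm = rng.permutation(len(x))
    eval_idx, est_idx = perm[:n_eval], perm[n_eval:]
    errors = {}
    for name, fit in procedures.items():
        predictor = fit(x[est_idx], y[est_idx])
        residuals = y[eval_idx] - predictor(x[eval_idx])
        errors[name] = float(np.mean(residuals ** 2))
    # Select the procedure with the smaller evaluation error.
    return min(errors, key=errors.get), errors

# Simulated data (an assumption): a smooth nonlinear regression function.
rng = np.random.default_rng(0)
n = 400
x = rng.uniform(0.0, 1.0, size=n)
y = np.sin(2 * np.pi * x) + rng.normal(scale=0.3, size=n)

procedures = {"linear": fit_linear, "kernel": fit_kernel}
# The evaluation part (100 points) is smaller than the estimation part (300).
winner, errors = cv_select(x, y, procedures, n_eval=100, rng=rng)
print(winner, errors)
```

Varying `n_eval` relative to `n` is one way to explore how the splitting ratio affects which procedure is selected. A single simulated run like this is only suggestive and says nothing about consistency by itself.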
Related articles
Cross Validation for Comparing Multiple Density Estimation Procedures
We demonstrate the consistency of cross validation for comparing multiple density estimators using simple inequalities on the likelihood ratio. In nonparametric problems, the splitting of data does not require the domination of test data over the training/estimation data, contrary to Shao (1993). The result is complementary to that of Yang (2005) and Yang (2006).
Comparing Learning Methods for Classification
We address the consistency property of cross validation (CV) for classification. Sufficient conditions are obtained on the data splitting ratio to ensure that the better classifier between two candidates will be favored by CV with probability approaching 1. Interestingly, it turns out that for comparing two general learning methods, the ratio of the training sample size and the evaluation size ...
Overview of the validation procedures for a vaccine production: from R&D level to the pre-qualification stage
Just like any other process, vaccine manufacturing procedures are defined as a series of interrelated functions and activities using a variety of specified actions and equipment designed to produce a defined product. To assure the reproducibility and consistency of such processes, they must be carried out using validated equipment and under the established procedures that meet all the acceptanc...
Consistency Properties of Model Selection Criteria in Multiple Linear Regression
This paper concerns the asymptotic properties of a class of criteria for model selection in linear regression models, which covers the most well-known criteria, e.g., Mallows' Cp, CV (cross-validation), GCV (generalized cross-validation), Akaike's AIC and FPE, as well as Schwarz's BIC. These criteria are shown to be consistent in the sense of selecting the true or larger models, assuming i.i.d....
Local M-Estimation of Regression Function
In this article, we investigate a robust version of local linear regression smoothers for stationary and censored stochastic processes by using M-type local polynomial techniques and transformations. Under some regularity conditions, we establish the weak and strong consistency as well as the asymptotic normality of proposed estimators. We propose an easily implemented bandwidth selection crite...